Wrap-Up: a Trainable Discourse Module for Information Extraction
Authors
Abstract
The vast amounts of on-line text now available have led to renewed interest in information extraction (IE) systems that analyze unrestricted text, producing a structured representation of selected information from the text. This paper presents a novel approach that uses machine learning to acquire knowledge for some of the higher level IE processing. Wrap-Up is a trainable IE discourse component that makes intersentential inferences and identifies logical relations among information extracted from the text. Previous corpus-based approaches were limited to lower level processing such as part-of-speech tagging, lexical disambiguation, and dictionary construction. Wrap-Up is fully trainable, and not only automatically decides what classifiers are needed, but even derives the feature set for each classifier automatically. Performance equals that of a partially trainable discourse module requiring manual customization for each domain.
Similar resources
Wrap-Up: a Trainable Discourse Module
The vast amounts of on-line text now available have led to renewed interest in Information Extraction (IE) systems that analyze unrestricted text, producing a structured representation of selected information from the text. This paper presents a novel approach that uses machine learning to acquire knowledge for some of the higher level IE processing. Wrap-Up is a trainable IE discourse componen...
Wrap-Up: a Trainable Discourse Module for Information
The vast amounts of on-line text now available have led to renewed interest in information extraction (IE) systems that analyze unrestricted text, producing a structured representation of selected information from the text. This paper presents a novel approach that uses machine learning to acquire knowledge for some of the higher level IE processing. Wrap-Up is a trainable IE discourse componen...
AAAI 1995 Spring Symposium on Empirical Methods in Discourse Interpretation and Generation: Learning Domain-Specific Discourse Rules for Information Extraction
This paper describes a system that learns discourse rules for domain-specific analysis of unrestricted text. The goal of discourse analysis in this context is to transform locally identified references to relevant information in the text into a coherent representation of the entire text. This involves a complex series of decisions about merging coreferential objects, filtering out irrelevant inform...
Learning Domain-Specific Discourse Rules for Information Extraction
This paper describes a system that learns discourse rules for domain-specific analysis of unrestricted text. The goal of discourse analysis in this context is to transform locally identified references to relevant information in the text into a coherent representation of the entire text. This involves a complex series of decisions about merging coreferential objects, filtering out irrelevant inf...
Tools and techniques for rapid porting
Charlie Dolan, from Hughes Research Laboratories, discussed some of the difficulties in using trainable components in an information extraction system. The UMass/Hughes system used six different trainable components in their MUC5 system; portability between the EJV and EME domains was achieved partly through retraining these components. One of these components, the Trainable Template Generato...
Journal title:
- J. Artif. Intell. Res.
Volume 2, Issue
Pages -
Publication date: 1994